NOV-02-2004 TUE 01:32 PM MARGER JOHNSON 



FAX NO. 5032744622 



P. 



IN THE CLAIMS 

1. (Currently Amended) A method for determining dominant phrase vectors in a
topological vector space for a semantic content of a document on a computer system, the
method comprising:

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

accessing dominant phrases for the document, the dominant phrases representing a
condensed content for the document;

measuring how concretely each dominant phrase is represented in each chain in the
basis and the dictionary;

constructing at least one state vector in the topological vector space for each
dominant phrase using the measures of how concretely each dominant phrase is represented
in each chain in the dictionary and the basis; and

collecting the state vectors into the dominant phrase vectors for the document.
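
The dictionary structure recited in claim 1 (a directed set of concepts with a maximal element, and at least one chain from the maximal element to every concept) can be pictured with a small sketch. Everything named here is hypothetical and does not come from the application: the toy concepts, the function `chains_from_maximal`, and in particular the choice of the leaf-ending chains as the basis, since the claim only requires some subset of the chains.

```python
from collections import deque

# Hypothetical toy dictionary: each concept maps to its more specific children.
# "thing" plays the role of the maximal element.
DICTIONARY = {
    "thing": ["animal", "artifact"],
    "animal": ["dog", "cat"],
    "artifact": ["tool"],
    "dog": [],
    "cat": [],
    "tool": [],
}

def chains_from_maximal(dictionary, maximal):
    """Return one chain (a path starting at the maximal element) per concept."""
    chains = {maximal: [maximal]}
    queue = deque([maximal])
    while queue:
        concept = queue.popleft()
        for child in dictionary[concept]:
            if child not in chains:  # keep the first chain found for a concept
                chains[child] = chains[concept] + [child]
                queue.append(child)
    return chains

chains = chains_from_maximal(DICTIONARY, "thing")

# Assumption: take the chains that end at leaf concepts as the basis.
basis = [chain for name, chain in chains.items() if not DICTIONARY[name]]
```

Here every concept is reachable from "thing", so each concept has at least one chain, and `basis` holds three of those chains.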

2. (Original) A method according to claim 1, wherein accessing dominant
phrases includes extracting the dominant phrases from the document using a phrase extractor.

3. (Original) A method according to claim 1, wherein accessing dominant
phrases includes storing the dominant phrases in computer memory accessible by the
computer system.

4. (Original) A method according to claim 1, the method further comprising
forming a semantic abstract comprising the dominant phrase vectors. 

5. (Currently Amended) A method for determining dominant vectors in a
topological vector space for a semantic content of a document on a computer system, the
method comprising:

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

storing the document in computer memory accessible by the computer system;

extracting words from at least a portion of the document;

. ..I rwr ^'-^'""^ ,,n.rhchninintt..-t^.'^^4.hc

constructing a state vector in the topological vector space for each word using the
basis;

filtering the state vectors; and 

collecting the filtered state vectors into the dominant vectors for the document. 

6. (Original) A method according to claim 5, wherein extracting words
includes extracting words from the entire document.

7. (Original) A method according to claim 5, wherein filtering the state
vectors includes selecting the state vectors that occur with highest frequencies.
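
The frequency-based filtering of claim 7 can be sketched as follows. This is a minimal illustration, assuming state vectors are stored as hashable tuples and that the number of vectors to keep is supplied by the caller; the name `filter_by_frequency` and the sample data are hypothetical and are not taken from the application.

```python
from collections import Counter

def filter_by_frequency(state_vectors, keep):
    """Keep the `keep` distinct state vectors that occur most often."""
    counts = Counter(state_vectors)  # vectors must be hashable, e.g. tuples
    return [vector for vector, _ in counts.most_common(keep)]

# Hypothetical state vectors, one per extracted word.
vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5), (1.0, 0.0), (0.0, 1.0)]
dominant = filter_by_frequency(vectors, keep=2)
```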

8. (Original) A method according to claim 5, wherein filtering the state
vectors includes:

calculating a centroid in the topological vector space for the state vectors; and

selecting the state vectors nearest the centroid.
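
The two steps of claim 8 can be sketched in a few lines. This is only an illustration under stated assumptions: the centroid is taken as the component-wise mean, nearness is measured with Euclidean distance, and the number of vectors to keep is a caller-supplied parameter. None of these choices, nor the function names, are specified by the claim.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def nearest_to_centroid(vectors, keep):
    """Return the `keep` vectors closest (Euclidean) to the centroid."""
    c = centroid(vectors)
    return sorted(vectors, key=lambda v: math.dist(v, c))[:keep]

# Hypothetical state vectors; (10.0, 10.0) is an outlier that gets filtered out.
vectors = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
selected = nearest_to_centroid(vectors, keep=2)
```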

9. (Original) A method according to claim 5, the method further comprising
forming a semantic abstract comprising the dominant vectors. 

10. (Original) A computer-readable medium containing a program to
determine dominant vectors in a topological vector space for a semantic content of a 
document on a computer system, the program being executable on the computer system to 
implement the method of claim 5. 

11. (Currently Amended) A method for determining a semantic abstract in a
topological vector space for a semantic content of a document on a computer system, the
method comprising:

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

storing the document in computer memory accessible by the computer system;
determining dominant p l u oDO vectors e hrassiLfor the document; 

ha^U and the dictionarvi vi^^^mant 
rhain in the dictionary arid the basU; 

'^^'^1 i 1 iiiiriT ..■■■.■.,.n.w..vccu,,sP»6^foua^^;ii»a^^ 

generating the semantic abstract using the dominant phrase vectors and the dominant
vectors.



12. (Original) A method according to claim 11, wherein generating the
semantic abstract includes reducing the dominant phrase vectors based on the dominant
vectors.



13. (Original) A method according to claim 11, wherein generating the
semantic abstract includes reducing the dominant vectors based on the dominant phrase
vectors.



14. (Original) A method according to claim 11, wherein generating the
semantic abstract includes obtaining a probability distribution function for a ■'^J^
the dominant phrase vector, similar to a probability distribution function for the dominant

semantic
the domi
phrase vectors

15. (Original) A method according to claim 11, the method further comprising
identifying the lexemes or lexeme phrases corresponding to state vectors in the semantic
abstract. 

16. (Original) A computer-readable medium containing a program to
determine a semantic abstract in a topological vector space for a semantic content of a 
document on a computer system, the program being executable on the computer system to 
implement the method of claim 11. 

17. (Currently Amended) A method for comparing the semantic content of first
and second documents on a computer system, the method comprising:

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

accessing dominant phrases for the first document, the dominant phrases representing
a condensed content for the first document;

measuring how concretely each dominant phrase for the first document is represented
in each chain in the basis and the dictionary;

constructing at least one state vector for the first document in the topological vector
space for each dominant phrase for the first document using the measures of how concretely
each dominant phrase for the first document is represented in each chain in the dictionary and
the basis;

collecting the state vectors for the first document into the semantic abstract for the
first document;

determining a semantic abstract for the second document;

measuring a distance between the semantic abstracts; and

classifying how closely related the first and second documents are using the distance.

18. (Original) A method according to claim 17, wherein measuring a distance
includes measuring a Hausdorff distance between the semantic abstracts. 
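
For reference, the Hausdorff distance between two finite sets of vectors is the larger of the two directed distances, where each directed distance is the greatest distance from a point in one set to its nearest point in the other. The sketch below assumes each semantic abstract is a non-empty list of tuples and that the underlying point-to-point metric is Euclidean; the claim does not fix that metric, and the function names and sample abstracts are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Greatest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite sets of vectors."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical semantic abstracts for two documents.
abstract_1 = [(0.0, 0.0), (1.0, 0.0)]
abstract_2 = [(0.0, 0.0), (4.0, 0.0)]
distance = hausdorff(abstract_1, abstract_2)
```

In this example the directed distance from the first abstract to the second is 1.0 and the reverse is 3.0, so the Hausdorff distance is 3.0.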



Docket No.: 6647-13

Serial No.: 09/615,726

PAGE 8/36* RCVD AT 11/212004 4:27:20 PM [Eastern Standard Time] ' SVR:USPT0{FXRF-1/2' DNIS:8729306* CSID:S032744622* DURATION (inin-ss):10-64 






19. (Original) A method according to claim 17, wherein measuring a distance
includes determining a centroid vector in the topological vector space for each semantic
abstract.

20. (Original) A method according to claim 19, wherein measuring a distance
further includes measuring an angle between the centroid vectors. 

21. (Original) A method according to claim 19, wherein measuring a distance
further includes measuring a Euclidean distance between the centroid vectors. 
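
The centroid-based measures of claims 19 through 21 can be sketched together. This is an illustration only, assuming each semantic abstract is a non-empty list of equal-length tuples, the centroid is the component-wise mean, and neither centroid is the zero vector; the function names and sample data are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    cos = dot / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error

# Hypothetical semantic abstracts for two documents.
abstract_1 = [(1.0, 0.0), (3.0, 0.0)]
abstract_2 = [(0.0, 1.0), (0.0, 5.0)]

c1, c2 = centroid(abstract_1), centroid(abstract_2)
angle = angle_between(c1, c2)   # claim 20: angle between the centroid vectors
euclidean = math.dist(c1, c2)   # claim 21: Euclidean distance between them
```

The angle ignores vector length and compares direction only, while the Euclidean distance is sensitive to both.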

22. (Original) A computer-readable medium containing a program to compare
the semantic content of first and second documents on a computer system, the program being 
executable on the computer system to implement the method of claim 17. 

23. (Currently Amended) A method for locating a second document on a
computer with a semantic content similar to a first document, the method comprising:

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

accessing dominant phrases for the first document, the dominant phrases representing
a condensed content for the first document;

measuring how concretely each dominant phrase for the first document is represented
in each chain in the basis and the dictionary;

constructing at least one state vector for the first document in the topological vector
space for each dominant phrase for the first document using the measures of how concretely
each dominant phrase for the first document is represented in each chain in the dictionary and
the basis;

collecting the state vectors for the first document into the semantic abstract for the
first document;

locating a second document;

determining a semantic abstract for the second document;

measuring a distance between the semantic abstracts for the first and second
documents;

classifying how closely related the first and second documents are using the distance;
and

if the second document is classified as having a semantic content similar to the
semantic content of the first document, selecting the second document.

24. (Original) A method according to claim 23, the method further
comprising, if the second document is classified as not having a semantic content similar to 
the semantic content of the first document, rejecting the second document. 
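
The select-or-reject logic of claims 23 and 24 can be sketched as a simple threshold test on the measured distance. The claims do not say how the classification is made, so the threshold, the use of the Euclidean distance between centroids as the measure, and every name below are assumptions for illustration only.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def centroid_distance(a, b):
    """Assumed distance measure: Euclidean distance between centroids."""
    return math.dist(centroid(a), centroid(b))

def select_similar(first_abstract, candidates, threshold):
    """Split candidate documents into selected (similar) and rejected."""
    selected, rejected = [], []
    for name, abstract in candidates.items():
        if centroid_distance(first_abstract, abstract) <= threshold:
            selected.append(name)
        else:
            rejected.append(name)
    return selected, rejected

# Hypothetical semantic abstracts.
first = [(1.0, 0.0), (1.0, 2.0)]
candidates = {
    "doc_a": [(1.0, 1.0), (1.5, 1.0)],
    "doc_b": [(9.0, 9.0)],
}
selected, rejected = select_similar(first, candidates, threshold=1.0)
```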

25. (Currently Amended) An apparatus on a computer system to determine a
semantic abstract in a topological vector space for a semantic content of a document stored
on the computer system, the apparatus comprising:

a phrase extractor adapted to extract phrases from the document;
a state vector constructor adapted to construct state vectors in the
topological vector space for each phrase extracted by the phrase extractor, the state
....^. ^ ^e..urin. how rnr ^r r.r.\ y e,ch phm -e .xtn.rtr n by the phrase extrac ^lMS 

e».h chain in a_b. , i n rj i rtinnnn rh. d^ction.r v includin P a directed set p f 
, ^»vi^.l elem. I ■■ > nnn rhnin from th. m axima) .lemen t to 

.....^ ..... pt the directed set^th^L basi^l^^ ^'^ ^ '^"^'"''^ 

collection means for collecting the state vectors into the semantic abstract for the
document.

26. (Original) An apparatus according to claim 25, the apparatus further
comprising filter means for filtering the state vectors to reduce the size of the semantic
abstract.

27. (Original) An apparatus according to claim 25, wherein the state vector
constructor is further adapted to construct a state vector for each word in the document.

28. (Canceled)

29. (Currently Amended) A method for determining a semantic abstract in a
topological vector space for a semantic content of a document on a computer system, the
method comprising:

extracting dominant phrases from the document using a phrase extractor, the
dominant phrases representing a condensed content for the document;

identifying a directed set of concepts as a dictionary, the directed set including a
maximal element and at least one concept, and at least one chain from the maximal element
to every concept;

selecting a subset of the chains to form a basis for the dictionary;

measuring how concretely each dominant phrase is represented in each chain in the
basis and the dictionary;

constructing at least one first state vector in the topological vector space for each
dominant phrase using the measures of how concretely each dominant phrase is represented
in each chain in the dictionary and the basis;

collecting the first state vectors into dominant phrase vectors for the document;

extracting words from at least a portion of the document;

constructing a second state vector in the topological vector space for each word using
the dictionary and the basis;

filtering the second state vectors;

collecting the filtered second state vectors into dominant vectors for the document;
and

generating the semantic abstract using the dominant phrase vectors and the dominant
vectors.

30. (Original) A method according to claim 29, the method further comprising
comparing the semantic abstract with a second semantic abstract for a second document to 
determine how closely related the contents of the documents are. 

31. (New) A method according to claim 1, wherein:

measuring how concretely each dominant phrase is represented in each chain in the
basis and the dictionary includes:

identifying at least one lexeme in each dominant phrase; and
measuring how concretely each lexeme in each dominant phrase is represented
in each chain in the basis and the dictionary; and

constructing at least one state vector in the topological vector space for each dominant
phrase includes constructing at least one state vector in the topological vector space for each
lexeme in each dominant phrase using the measures of how concretely each lexeme in each
dominant phrase is represented in each chain in the basis and the dictionary.

32. (New) A method according to claim 17, wherein determining a semantic
abstract for the second document includes:

accessing dominant phrases for the second document, the dominant phrases
representing a condensed content for the second document;

measuring how concretely each dominant phrase for the second document is
represented in each chain in the basis and the dictionary;

constructing at least one state vector for the second document in the topological
vector space for each dominant phrase for the second document using the measures of how
concretely each dominant phrase for the second document is represented in each chain in the
dictionary and the basis; and

collecting the state vectors for the second document into the semantic abstract for the
second document.

33. (New) A method according to claim 23, wherein determining a semantic
abstract for the second document includes:

accessing dominant phrases for the second document, the dominant phrases
representing a condensed content for the second document;

measuring how concretely each dominant phrase for the second document is
represented in each chain in the basis and the dictionary;

constructing at least one state vector for the second document in the topological
vector space for each dominant phrase for the second document using the measures of how
concretely each dominant phrase for the second document is represented in each chain in the
dictionary and the basis; and

collecting the state vectors for the second document into the semantic abstract for the
second document.

34. (New) A method according to claim 29, wherein constructing a second state
vector in the topological vector space for each word using the dictionary and the basis
includes:

measuring how concretely each word is represented in each chain in the basis and the
dictionary; and

constructing the second state vectors in the topological vector space for each word 
using the measures of how concretely each word is represented in each chain in the dictionary 
and the basis. 

35. (New) A method according to claim 1, wherein constructing at least one state
vector includes constructing the state vectors in the topological vector space for each 
dominant phrase using the measures of how concretely each dominant phrase is represented 
in each chain in the dictionary and the basis, the state vectors independent of the document. 

36. (New) A method according to claim 5, wherein constructing a state vector
includes constructing the state vector in the topological vector space for each word using the 
measures of how concretely each word is represented in each chain in the dictionary and the 
basis, the state vectors independent of the document. 

37. (New) A method according to claim 11, wherein constructing dominant
vectors includes constructing dominant vectors in the topological vector space for the words 
using the measures of how concretely each word is represented in each chain in the dictionary 
and the basis, the dominant vectors independent of the document. 

38. (New) A method according to claim 17, wherein constructing at least one
state vector includes constructing the state vectors for the first document in the topological 
vector space for each dominant phrase for the first document using the measures of how 
concretely each dominant phrase for the first document is represented in each chain in the 
dictionary and the basis, the state vectors independent of the first document and the second 
document. 

39. (New) A method according to claim 12, wherein constructing at least one
state vector includes constructing at least one state vector for the first document in the
topological vector space for each dominant phrase for the first document using the measures
of how concretely each dominant phrase for the first document is represented in each chain in
the dictionary and the basis, the state vectors independent of the first document and the
second document.

40. (New) An apparatus according to claim 25, wherein the state vector
constructor is operative to construct the state vectors independent of the document.

41. (New) A method according to claim 29, wherein:

constructing at least one first state vector includes constructing the first state vector in
the topological vector space for each dominant phrase using the measures of how concretely 
each dominant phrase is represented in each chain in the dictionary and the basis, the first 
state vector independent of the document; and 

constructing a second state vector includes constructing the second state vector in the
topological vector space for each word using the dictionary and the basis, the second state
vector independent of the document.